Modular vector systems

ABSTRACT

The present invention provides improved techniques and reagents for producing nucleic acid molecules. In certain preferred embodiments, the nucleic acid molecules are modular vectors. In certain preferred embodiments, the nucleic acid molecules are produced in polymerase chain reactions employing terminator primer residues.

PRIORITY CLAIM AND RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 10/383,135, now allowed, filed Mar. 5, 2003, which claims priority to U.S. Provisional Patent Application Ser. No. 60/362,253, filed Mar. 6, 2002. The entire contents of each of these applications are incorporated herein by reference. The present application is also related to co-pending applications U.S. patent application Ser. No. 09/225,990, filed Jan. 5, 1999, U.S. patent application Ser. No. 09/897,712, filed Jun. 29, 2001 (a nationalized application corresponding to PCT/US00/00189, filed Jan. 5, 2000), U.S. patent application Ser. No. 09/910,354, filed Jul. 20, 2001, and also to U.S. Patent Application Ser. No. 60/219,820, filed Jul. 21, 2000. The entire contents of each of these applications are incorporated herein by reference.

GOVERNMENT FUNDING

Some or all of the work described herein was supported by grant number MCB9604458 from the National Science Foundation and/or grant number AI48665 from the National Institutes of Health; the United States Government may have certain rights in the invention.

BACKGROUND OF THE INVENTION

Perhaps the classic genetic manipulation in molecular biology is the cleavage of a circular vector with one or more restriction enzymes and the ligation of a selected insert into the linearized vector. Since the 1970s, when the pioneers of molecular biology first demonstrated such manipulation to be feasible, significant research effort has been invested in the development of improved vector systems (see discussion of vectors derived from plasmids in Ausubel et al., Current Protocols in Molecular Biology, Section II, 1.5.1-1.5.17, John Wiley & Sons, 1998, incorporated herein by reference).

To give but a few examples, plasmid vectors that replicate in different hosts, with different copy numbers, have been prepared (e.g., bacterial vectors designed to have either relaxed or stringent control of replication; yeast vectors with either a 2μ or centromeric replication origin, mammalian vectors containing viral [e.g., SV40 or BPV] origins of replication, etc.). Vectors have been engineered to allow ready detection of insertion events (e.g., by creation or disruption of a selectable or detectable marker), to direct high levels of expression of proteins encoded by inserted sequences (e.g., under the control of transcription, splicing, and/or translation signals active in a given host system), to generate gene fusions that allow analysis of expression of inserted sequences (e.g., by analysis of β-galactosidase, chloramphenicol acetyl transferase, luciferase, or green fluorescent protein activity, etc.), or to create fusion proteins with experimentally useful attributes (e.g., easy purification, desired cellular localization, etc.). Vectors have been designed that are particularly useful for determining the sequence of inserted fragments (e.g., by allowing easy production of single-stranded DNA), or for producing RNA (sense or antisense) from the inserted sequences. Most companies that sell molecular biology reagents include among their products vectors that they have developed to be particularly useful for designated applications (see, for example, catalogs provided by Amersham Pharmacia Biotech, Piscataway, N.J.; Promega Corporation, Madison, Wis.; Invitrogen Inc., Carlsbad, Calif.; Life Technologies, Inc., Rockville, Md.; New England Biolabs, Beverly, Mass.; Stratagene, Inc., La Jolla, Calif.).

Of course, the universe of genetic “vectors” is not limited to circular molecules derived from bacterial plasmids. Any nucleic acid molecule that includes sequences sufficient to direct in vivo or in vitro self-replication can be employed as a vector. Typically, such replication sequences include a replication origin that directs duplication of the vector sequence in a host system (typically a transformed cell). Alternatively, sequences that direct integration of the vector into another nucleic acid molecule that is present in and replicated by the relevant host system can be sufficient to achieve vector (and insert) replication.

Most vectors in use today are derived from naturally-occurring bacterial plasmids, bacteriophages, or other viruses. Some vectors contain features of more than one of these systems. Almost all of the commonly-used vectors contain one or more restriction sites designed for convenient insertion of fragments; most have at least one polylinker (see, for example, the vector database maintained at the URL vectorbd.atcg.com/vectordb/vector.html, the contents of which as of Jul. 19, 2000 are included herein as Appendix A).

Despite the broad availability of vectors from commercial and other sources, each one has features selected by the relevant manufacturer rather than the experimental user. It is not uncommon for a researcher to have to modify an available vector to suit his experimental needs, or alternatively to modify his experimental design to accommodate the available vectors. There remains a need for the development of techniques and reagents that would allow a researcher to readily design and assemble vector(s) appropriate to his experimental needs.

SUMMARY OF THE INVENTION

The present invention encompasses the recognition that vectors are comprised of modular elements and need not be provided as discrete nucleic acid molecules into which fragments of interest are inserted. Rather, vectors can themselves be assembled from pieces that contain part or all of individual useful elements. In certain preferred embodiments of the invention, fragments corresponding to pieces of what is traditionally viewed as the “vector backbone” are provided individually and are linked to one another substantially simultaneously with the linkage that associates vector sequences with insert sequences.

According to the present invention, components of a vector can be defined as one of a variety of categories of vector elements. For example, sequences that allow the vector to replicate in a host system may be classified as “replication elements”. Similarly, sequences that allow host cells containing a vector to survive experimental conditions that kill otherwise identical host cells lacking a vector may be classified as “selectable elements”; sequences that allow detection but not selection of host cells containing vector sequences, or host cells containing vector and insert sequences, may be classified as “detectable elements”; sequences that can act to direct expression (i.e., transcription, splicing, and/or translation) of other sequences can be classified as “expression elements”. Other categories of elements may also be defined as discussed in further detail herein.

The present invention allows a researcher to select individual elements from one or more categories of vector elements, and to combine the selected elements with one another to assemble vectors that contain a desired collection and arrangement of elements. Individual vector elements, or portions or combinations thereof, are provided on separate “vector fragments” that are linked together to create the final vector. Thus, the present invention provides techniques and reagents useful in the assembly of vectors from individual vector fragments. Preferably, a vector assembled according to the present invention will include at least a replication element. More preferably, the vector will include one or more additional elements selected from the group consisting of additional replication elements (e.g., effective in different host systems), selectable markers, detectable markers, expression elements, fusion protein elements, mobile elements, recombination elements, cleavage site elements, etc. The inventive techniques and reagents may be employed to link two or more vector fragments to one another, serially or simultaneously, and also to link vector fragments with one or more insert fragments (again, serially or simultaneously).

In particularly preferred embodiments of the present invention, one or more of the vector and insert fragments used in the assembly of a final hybrid construct is prepared without the use of restriction enzymes (or any endonuclease). Most preferably, substantially all of the fragments that become linked together to produce a final assembled molecule are prepared without the use of restriction enzymes. In particularly preferred embodiments of the invention, RNA-Overhang Cloning and/or DNA Overhang Cloning are employed to produce vector and/or insert fragments. Also, in certain preferred embodiments of the invention, vector fragments, and optionally insert fragments, are linked to one another by ligation-independent cloning (i.e., without the use of a ligase enzyme).

DESCRIPTION OF THE DRAWING

FIG. 1 depicts assembly of a hybrid molecule comprising λ vector elements and an insert, according to the present invention.

FIG. 2 shows assembly of a hybrid molecule comprising bacterial vector elements and an insert in a three-molecule linkage reaction according to the present invention.

FIG. 3 depicts assembly of a hybrid molecule containing bacterial vector elements and an insert according to the present invention. Two vector fragments and one insert fragment are linked together to form a hybrid that can be selected by growth in the presence of tetracycline and lack of growth in the presence of ampicillin.

FIG. 4 depicts assembly of a hybrid molecule comprising bacterial vector elements and an insert according to the present invention. Two vector fragments, each of which contains a portion of a detectable element, and one insert fragment are linked together to form a hybrid. Hybrids that contain insert can be distinguished from those that do not by a blue/white screen.

FIG. 5 shows assembly of a hybrid molecule containing bacterial vector elements and an insert according to the present invention. Two vector fragments, one of which contains a bacterial origin of replication and a first portion of a LacZ gene and one of which contains an ampicillin resistance gene and a second portion of the LacZ gene, are linked to an insert fragment. Hybrids can be selected by growth in the presence of ampicillin; those containing insert can be distinguished from those lacking insert by a blue/white screen.

FIG. 6 shows assembly of a hybrid molecule from three vector fragments and one insert fragment. Linkage of the four fragments re-creates two vector elements, and operatively links a third (the promoter) with the insert sequences.

FIG. 7 shows collections of vector fragments, each of which contains only a single vector element, that may alternatively be linked to each other and an insert to form a hybrid molecule according to the present invention.

FIG. 8 depicts a kit comprising two collections of vector fragments that can be used in various combinations to create vectors with different attributes according to the present invention. The first collection of vector fragments contains three fragments, each of which includes the pGal promoter and a first portion of a selectable marker selected from the group consisting of the URA3, TRP1, and HIS3 genes. The second collection of vector fragments contains six different fragments, each of which contains a second portion of one of the selectable markers, and an origin of replication that is either a centromeric origin or a 2μ origin.
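The combinatorial character of the kit of FIG. 8 can be illustrated with a short enumeration. The sketch below is offered only as an aid to understanding: the dictionary layout and the function name are hypothetical, and it assumes, for illustration, that a functional selectable marker is reconstituted only when the first and second portions derive from the same gene.

```python
from itertools import product

MARKERS = ("URA3", "TRP1", "HIS3")
ORIGINS = ("centromeric", "2μ")

# First collection: pGal promoter plus a first portion of one marker (3 fragments).
first_collection = [{"promoter": "pGal", "marker": m} for m in MARKERS]

# Second collection: second portion of one marker plus one origin (6 fragments).
second_collection = [{"marker": m, "origin": o} for m, o in product(MARKERS, ORIGINS)]


def functional_pairs(first, second):
    """Return fragment pairs whose two marker portions come from the same gene.

    Assumption for illustration: only such pairs re-create an intact marker.
    """
    return [(a, b) for a, b in product(first, second) if a["marker"] == b["marker"]]


pairs = functional_pairs(first_collection, second_collection)
for a, b in pairs:
    print(a["promoter"], a["marker"], b["origin"])
```

Under the stated assumption, the nine fragments of the kit yield one vector for each pairing of a marker with an origin.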

FIG. 9 depicts assembly of a hybrid molecule from two vector fragments and one insert fragment, each of which was prepared by DOC, according to the present invention. Panel A shows the generation of the two vector fragments; Panel B depicts the ligation of these two fragments with the insert fragment to produce the final hybrid.

FIG. 10 shows a hybrid molecule assembled from two vector fragments and one insert fragment, each of which was prepared by DOC, according to the present invention.

FIG. 11 shows the primers used (3NT5′OST [SEQ ID NO: 34]; 3NT3′OHT [SEQ ID NO: 35]; 3NT5′KHT [SEQ ID NO: 36]; 3NT3′KST [SEQ ID NO: 37]; 1NT5′ORI [SEQ ID NO: 13]; 1NT3′Ori(s) [SEQ ID NO: 14]; 1NT5′KAN [SEQ ID NO: 11]; 1NT3′KAN [SEQ ID NO: 12]).

FIG. 12 shows a two-component modular vector system. This system was used to test various chimeric primers. Single-stranded tails generated by termination of polymerization are labeled A, A′, B, and B′. Unique restriction sites were included in the overhangs to allow for easy identification of recombinants and verification of junction sequence integrity.

FIG. 13 shows that termination of polymerization generates double-stranded DNA with specific single-stranded tails. Panel A shows a diagram of PCR products with tails generated by termination of polymerization (CTACCTAGCAAGcuuCAGCCTGAATGGCGAATGG [SEQ ID NO: 38], CCATTCGCCATTCAGGCTG [SEQ ID NO: 42], CGGAGCCTATGGAAAAACGC [SEQ ID NO: 39], AAGCTTGCTAGGuagGCGTTTTTCCATAGGCTCCG [SEQ ID NO: 43]). Deoxyribonucleotides are shown in upper case, ribonucleotides are shown in lower case. Panel B depicts the termination of polymerization by ribonucleotides and 2′-O-methyl nucleotides. PCR experiments were conducted with the following polymerases: Pfu (lanes 1-4), Pfu exo- (lanes 5-8) and Taq (lanes 9-12). The same 32P-labeled primer was used as a PCR primer in each experiment. Four different unlabeled primers were used. The four primers were identical except for the inclusion of one or three ribonucleotides, or a single 2′-O-methyl nucleotide, at a particular position (see Panel A).

FIG. 14 shows components that were used in assembling modular vectors. The modules were produced by PCR amplification of vector elements using primers containing 2′-O-methyl residues at particular positions. Identical overhang sequences are used for modules of analogous function (e.g., drug resistance genes), which make the modules readily interchangeable. Panel A shows components used for combinatorial assembly of six vectors. Panel B shows components used for expression cloning of two different genes into each of six modular vectors.

FIG. 15 depicts the termination of polymerization by single ribonucleotides (lanes 1, 5 and 9), single 2′-O-methyl nucleotides (lanes 2, 6, and 10), three ribonucleotides (lanes 3, 7 and 11), and DNA oligonucleotides (lanes 4, 8, and 12). PCR experiments were conducted with the following polymerases: Pfu (lanes 1-4), Pfu exo- (lanes 5-8) and Taq (lanes 9-12). The same 32P-labeled primer was used as a PCR primer in each experiment. Four different unlabeled primers were used. The four primers were identical except for the inclusion of one or three ribonucleotides, or a single 2′-O-methyl nucleotide, at a particular position.

FIG. 16 shows a diagram of PCR products with tails generated by termination of polymerization by three ribonucleotides, or a single ribonucleotide or 2′-O-methyl ribonucleotide (ribonucleotides and 2′-O-methyl ribonucleotide are indicated by lower case residues): (CTACCTAGCAAGcuuCAGCCTGAATGGCGAATGG [SEQ ID NO: 38], CCATTCGCCATTCAGGCTG [SEQ ID NO: 42], CGGAGCCTATGGAAAAACGC [SEQ ID NO: 39], AAGCTTGCTAGGuagGCGTTTTTCCATAGGCTCCG [SEQ ID NO: 43], CTACCTAGCAAGCTuCAGCCTGAATGGCGAATGG [SEQ ID NO: 40], CCATTCGCCATTCAGGCTG [SEQ ID NO: 42], CGGAGCCTATGGAAAAACGC [SEQ ID NO: 41], AAGCTTGCTAGGTAgGCGTTTTTCCATAGGCTCCG [SEQ ID NO: 44]).

DEFINITIONS

“Element”—The term “element” is used herein to refer to a region of nucleic acid sequence that imparts a particular functional or structural characteristic upon the molecule.

“Expression”—“Expression” of a nucleic acid sequence, as that term is used herein, refers to one or more of the following events: (a) production of an RNA template from a DNA sequence (e.g., by transcription); (b) processing of an RNA transcript (e.g., by splicing, editing, and/or 3′ end formation); (c) translation of an RNA into a polypeptide or protein; (d) post-translational modification of a polypeptide or protein.

“Fragment”—A “fragment”, as that term is used herein, is an individual nucleic acid molecule that can be hybridized or linked with one or more other fragment molecules to produce a hybrid molecule. Preferably, a fragment contains at least a portion of a selected sequence element so that, when the fragments are linked together, a hybrid molecule is generated that contains a predetermined collection and arrangement of sequence elements. In certain preferred embodiments of the invention, each fragment contains at least one intact sequence element. In other preferred embodiments, each fragment contains only one intact sequence element. In still other preferred embodiments, at least one fragment contains only a portion of a particular sequence element (though the fragment may also contain a complete copy of a different sequence element). Preferably, that fragment will become linked with another fragment so that the complete sequence element is reassembled in the final hybrid. Alternatively or additionally, fragments are selected so that different hybrid molecules can be produced from linkage of the same collection of fragments, and such different hybrids can be distinguished from one another on the basis of whether a particular sequence element is recreated in the hybrid. Preferred fragments for use in accordance with the present invention are prepared without the use of restriction enzymes. Most preferably they are prepared by polymerase chain reaction (PCR) amplification according to ROC or DOC techniques (see, for example, U.S. Ser. No. 60/114,909, U.S. Ser. No. 09/225,990, and Coljee et al., Nature Biotechnology 18:789, July 2000, each of which is incorporated herein by reference in its entirety). Preferred fragments are double-stranded nucleic acid molecules with at least one single-stranded overhang.

“Host system”—A “host system” according to the present invention is any in vivo or in vitro system into which a vector is introduced. Preferably, the host system is a cell or organism. Any type of cell, including a bacterial cell, yeast cell, plant cell, or animal cell, can be a host cell. Cells in culture and cells that are part of living tissues or organisms can also be host cells.

“Hybrid”—A “hybrid” nucleic acid molecule according to the present invention is a molecule produced by hybridization and/or linkage of at least two fragments or elements to one another.

“Linkage”—The “linkage” of two or more nucleic acid molecules to one another according to the present invention refers to any reaction that results in formation of a covalent bond between two nucleic acid molecules that were not covalently attached to one another prior to the linkage reaction. Preferably, the linkage is accomplished either by splicing or by ligation. Alternatively, linkage may be accomplished indirectly, for example by replication of molecule pairs (or clusters) held together by ligation but including one or more nicks. Linkage may occur in vitro or in vivo.

“Overhang”—An “overhang”, according to the present invention, is a single-stranded region of nucleic acid extending from a double-stranded region. Preferred overhangs are at least one nucleotide long. Particularly preferred overhangs are at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, or 25 nucleotides long. In some preferred embodiments of the invention, the overhangs are comprised of at least one, preferably at least 2, 3, 4, 5, or more RNA residues; in other preferred embodiments the overhangs are comprised of DNA. In some embodiments of the invention, overhangs may comprise RNA elements that include functional intronic sequences.
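By way of illustration only, the complementarity of two overhangs can be checked computationally. The sketch below uses the primers of SEQ ID NO: 38 and SEQ ID NO: 43 (FIG. 13, Panel A) and assumes, purely for the purpose of the example, that each tail runs from the 5′ end of the primer through its last lower-case residue; the function names are hypothetical, and ribonucleotide u is read as T so that the tails can be compared as DNA.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def as_dna(seq):
    """Upper-case a sequence and read ribo-U as T for comparison purposes."""
    return seq.upper().replace("U", "T")


def tail(primer):
    """Return the 5' segment through the last lower-case residue.

    Assumption for illustration: this segment is taken as the overhang.
    """
    end = max(i for i, ch in enumerate(primer) if ch.islower()) + 1
    return as_dna(primer[:end])


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tails_can_anneal(primer_a, primer_b):
    """True when the two tails are exact reverse complements of each other."""
    return tail(primer_a) == reverse_complement(tail(primer_b))


seq_38 = "CTACCTAGCAAGcuuCAGCCTGAATGGCGAATGG"
seq_43 = "AAGCTTGCTAGGuagGCGTTTTTCCATAGGCTCCG"

print(tail(seq_38), tail(seq_43), tails_can_anneal(seq_38, seq_43))
```

Under this assumption each tail is 15 nucleotides long, within the particularly preferred range recited above, and the two tails are reverse complements of one another.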

“Portion”—A “portion” of a nucleic acid molecule or polypeptide molecule, as that term is used herein, is any piece that is shorter in length than the entire molecule. Preferably, a portion has a length sufficient to be characteristic of the full length molecule. For nucleic acid molecules, preferred portions are usually at least about 3-5 residues in length, more preferably at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 100 residues in length. For polypeptide molecules, preferred portions are typically at least about 2-5 residues in length, more preferably at least about 7, 10, 15, 20, 25, 30, or 40 residues in length.

“Primer”—The term “primer”, as used herein, refers to a polynucleotide molecule that is characterized by an ability to be extended against a template nucleic acid strand, so that a polynucleotide strand whose sequence is complementary to that of at least a portion of the template strand is produced linked to the primer. Preferred primers are at least approximately 5-10 nt long; particularly preferred primers are at least about 15 nt long. In many preferred embodiments, primers preferably have a length within the range of about 18-30 nt, preferably longer than approximately 20 nt.

DESCRIPTION OF CERTAIN PREFERRED EMBODIMENTS OF THE INVENTION

As described above, the present invention recognizes that vectors need not be provided as intact, discrete molecules, but rather can be provided as fragments that contain all or part of particular desired sequence elements. The invention provides techniques and reagents for the assembly of vectors (and/or inserts) through the linkage of such fragments. Certain preferred embodiments of this invention are described in more detail below.

Vector Elements

As will be appreciated by those of ordinary skill in the art, any desired nucleic acid sequence can be considered a vector element according to the present invention. Practitioners will be aware of their own needs and desires in terms of vector functions and attributes, and will readily be able to select appropriate sequences for use as vector elements. Nonetheless, certain types of sequence elements are already well established as useful in the field of vector construction. For example, Invitrogen Corporation, one of the larger distributors of molecular biology reagents, provides on its web site (www.invitrogen.com) a page entitled “Anatomy of a Vector” that lists the following categories of vector elements: promoters, inducible elements, transcriptional termination sequences, origins of DNA replication, affinity purification tags, multiple cloning sites/polylinkers, and selectable markers. The contents of this site, as they were presented on Jul. 19, 2000, are included herein as Appendix B.

Replication Elements

As described above, any sequence that operates to ensure replication of vector sequences in a selected host system constitutes a replication element. A variety of replication elements are already available in the art, and have been employed in commonly-available vector systems (see, for example, Ausubel et al., Current Protocols in Molecular Biology, Section II, Unit 1.5.1-1.5.17, John Wiley & Sons, 1998, the entire contents of which are incorporated herein by reference).

It will be appreciated by those of ordinary skill in the art that it is often desirable to construct a vector containing more than one replication element. For example, if it is desired that the same vector be able to replicate in more than one host cell type (e.g., in both bacterial cells and mammalian cells), then the vector should be designed to include replication elements that operate in each relevant cell type. On the other hand, it is also known that certain replication elements are incompatible with one another in a given cell type. It is generally desirable not to include incompatible elements in a single construct unless fragmentation of the construct in the host cell is desired.

Available replication elements that are known to operate in E. coli, the most commonly employed bacterium in molecular biology, include both high copy (so-called “relaxed control”) elements such as pMB1 (100-300 copies/cell; Bolivar et al., Gene 2:95, 1977), ColE1 (>15 copies/cell; Kahn et al., Method. Enzymol. 68:268, 1979) and p15A (about 15 copies/cell; Chang et al., J. Bacteriol. 134:1141, 1978) and low copy (so-called “stringent control”) elements such as pSC101 (about 6 copies/cell; Stoker et al., Gene 18:335, 1982), F (1 to 2 copies/cell; Kahn et al., Method. Enzymol. 68:268, 1979), and RK2 (2-4 copies/cell; Kahn et al., Method. Enzymol. 68:268, 1979). The R1 (low copy at 30° C. and high copy above 35° C.; Uhlin et al., Gene 22:225, 1983) replicon also operates in E. coli, as do various phage origins of replication including λ dv (Jackson et al., Proc. Natl. Acad. Sci. USA 69:2904, 1972), M13, f1, etc.
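For convenience, the E. coli replication elements and copy numbers recited above can be collected into a simple lookup table from which a practitioner might choose an element by control class. The following is a minimal sketch only: the table layout and the function name are hypothetical, and the copy-number strings merely restate the figures given in the preceding paragraph.

```python
# Copy numbers per cell as recited in the text; "control" is the class named there.
REPLICONS = {
    "pMB1":   {"control": "relaxed",   "copies_per_cell": "100-300"},
    "ColE1":  {"control": "relaxed",   "copies_per_cell": ">15"},
    "p15A":   {"control": "relaxed",   "copies_per_cell": "about 15"},
    "pSC101": {"control": "stringent", "copies_per_cell": "about 6"},
    "F":      {"control": "stringent", "copies_per_cell": "1 to 2"},
    "RK2":    {"control": "stringent", "copies_per_cell": "2-4"},
}


def replicons_by_control(control):
    """Return the names of replication elements in the given control class."""
    return [name for name, info in REPLICONS.items() if info["control"] == control]


print(replicons_by_control("relaxed"))
print(replicons_by_control("stringent"))
```

The temperature-dependent R1 replicon and the phage origins are omitted from the table because the text does not assign them to either class.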

Replication elements that are known to operate in bacteria other than E. coli include RK2 and RSF1010, which have been shown, unlike ColE1, to have relatively broad host-ranges. In some cases, it may be desirable (or necessary) to introduce vectors into bacterial host cells through a mating process, in which case sequence elements encoding certain trans-acting factors (e.g., the tra or mob genes) may be required, as may be the cis-acting oriT site.

There are two primary categories of replication elements known to operate in yeast cells: centromeres and the 2μ replicon. Of course, since DNA can readily be targeted for integration in yeast cells, it is not always necessary for a vector to be used in yeast cells to include an origin of replication that is active in those cells. Sequences that target integration of the vector into other replicating nucleic acid molecules are sufficient to constitute a replication element according to the present invention in those circumstances.

Several viral origins of replication, such as simian virus 40 [SV40], bovine papilloma virus [BPV], and Epstein Barr Virus [EBV] oris, are known to operate in mammalian cells (sometimes requiring the presence of additional viral genes) and therefore can be employed as mammalian replication elements according to the present invention. Alternatively, sequences sufficient to target integration of a vector into another nucleic acid molecule (e.g., a chromosome or virus) capable of replicating in the mammalian cell can be employed. Targeted homologous recombination has been demonstrated to work effectively in mammalian cells, so that regions of homologous gene sequence can operate as replication elements according to the invention. Analogously, sequence elements of the Cre recombinase system can be employed to direct integration of vector sequences in mammalian systems (see, for example, Fukushige et al., Proc. Natl. Acad. Sci. USA, 1992).

Viral origins of replication such as the baculovirus origin are known to operate in insect cells and can be employed as replication elements according to the present invention, as can other sequences, such as P-element sequences, that enable integration of vector sequences into other replication-competent nucleic acids.

In certain embodiments of the invention, it will be desirable to provide a particular replication element in two parts, on two different fragments, so that hybrid molecules will only replicate if they contain properly ligated fragments (see, for example, FIGS. 3, 4, and 6). In other embodiments, replication elements are provided intact on a single vector fragment (see, for example, FIGS. 2, 5, and 7-9).

Vector Detection Elements

A wide variety of sequences are available that allow host cells containing vector to be distinguished from host cells that do not contain vector. There are two basic categories of such elements: those that contain a selectable marker (i.e., one that imparts a growth advantage to vector-containing cells under certain conditions) and those that contain a detectable marker. A wide variety of such markers is available, for use in different cell types.

The most commonly employed selectable markers utilized in bacterial systems are those that confer resistance to antibiotics such as ampicillin, chloramphenicol, kanamycin, and tetracycline. Similarly, selectable markers commonly utilized in insect and/or mammalian cells include those that confer resistance to zeocin, neomycin, blasticidin, or hygromycin. The DHFR gene, which confers the ability to grow in the absence of exogenous purines (and also confers resistance to methotrexate), can also be used as a selectable marker in a range of cell types including mammalian cells. Also, cytosine deaminase can be used as a selectable marker under conditions that require cells to convert cytosine to uracil for growth. Other selectable markers useful in mammalian cells include, for example, hygromycin-B-phosphotransferase (HPH), puromycin-N-acetyl transferase (PAC), thymidine kinase (TK), and xanthine-guanine phosphoribosyltransferase (XGPRT).

The most commonly employed selectable markers utilized in yeast cells include those that confer the ability to grow in the absence of a given nutrient such as uracil, tryptophan, histidine, leucine, lysine, etc.

Preferred detectable markers for use in accordance with the present invention include genes encoding proteins that produce detectable products. Commonly employed detectable markers include, for example, the β-galactosidase gene, the green fluorescent protein gene, the horseradish peroxidase gene, the nitric oxide synthase gene, the chloramphenicol acetyl transferase gene, the luciferase gene, etc.

Those of ordinary skill in the art will readily appreciate that most or all of these vector detection elements can alternatively be employed as insert detection elements. For example, FIGS. 3-5 depict inventive reactions in which vector fragments are designed so that, if they become linked to one another, a vector detection element is created. On the other hand, if an insert fragment becomes linked between them, the vector detection element is not created. Thus, constructs containing the insert fragment and those not containing the fragment can readily be distinguished from one another.
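The logic of this insert detection strategy can be expressed as a simple model. The sketch below is illustrative only: fragment names and function names are hypothetical, a hybrid is represented as an ordered list of fragment labels across the junction of interest, and it is assumed, as in a conventional blue/white screen, that an intact detectable marker scores as blue and a disrupted one as white.

```python
def detection_element_created(hybrid, first="marker_part_1", second="marker_part_2"):
    """True when the two marker portions are directly adjacent in the hybrid."""
    return any(a == first and b == second for a, b in zip(hybrid, hybrid[1:]))


def screen_result(hybrid):
    """Model of a blue/white screen: intact marker gives blue, otherwise white.

    Assumption for illustration: the marker is functional only when its two
    portions are linked directly to one another.
    """
    return "blue" if detection_element_created(hybrid) else "white"


without_insert = ["marker_part_1", "marker_part_2"]
with_insert = ["marker_part_1", "insert", "marker_part_2"]

print(screen_result(without_insert))
print(screen_result(with_insert))
```

In this model, a hybrid lacking insert re-creates the marker, whereas a hybrid in which the insert is linked between the two vector fragments does not, so the two classes of construct give different screen results.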

Similarly, those of ordinary skill in the art will appreciate that it will often be desirable to design vector and/or insert fragments so that a vector detection element is only created if the fragments become linked together in the desired arrangement. FIG. 6, for example, depicts a particular embodiment of the invention in which this strategy was employed to simplify hybrid construct production according to the present invention.

It should be noted that one advantage of the present invention is that it renders the insert detection strategies described in the previous two paragraphs particularly practicable. The inventive modular approach to vector assembly, and particularly the inventive employment of cloning technologies that do not require restriction digestion, removes the need for a polylinker in order to introduce insert sequences into a vector. Since polylinkers add unnatural sequences, their location in the middle of a detectable or selectable gene typically disrupted the gene activity, so that it was not possible to use reverse selection or detection to assay for insert insertion. By contrast, the inventive technologies allow the seamless union of insert and vector sequences, making feasible the use of these convenient screens and selections.

Expression Elements

As will be appreciated by those of ordinary skill in the art, one of the most common uses of vector systems in molecular biology is to arrange for expression of insert sequences in a host cell of interest. Any sequence that participates in directing or regulating expression of a linked sequence can be an expression element according to the present invention. A wide variety of such sequences are known in the art; certain examples are discussed in more detail below.

PROMOTER: Promoters are the regions of DNA that are responsible for establishing the initiation site for transcription. A variety of different promoters, operative in different systems, have been defined and characterized. Different promoters may direct expression of linked sequences at different levels. Furthermore, some promoters are constitutively active, while others can have their activity modulated through adjustment of the experimental conditions. Some promoters are active in only particular cell types, whereas others are ubiquitously active.

Preferred promoters known to be active in bacterial cells include, for example, P_(BAD), P_(L), P_(R), lac, tac, trc, spa, lacUV5, T3, T7, T7 LAC, SP6, etc.; preferred promoters known to be active in yeast cells include, for example, pGAL1, pAOX1, pADH, etc.; preferred promoters known to be active in insect cells include, for example, the MT, Ac5, and polyhedrin promoters, etc.; preferred promoters known to be active in mammalian cells include, for example, P_(ΔHSP), P_(SG), P_(CMV), P_(EF-1α), P_(SV40), P_(RSV), P_(PGK), P_(MMTV), P_(MC1), etc.

ENHANCERS/TRANSCRIPTIONAL REGULATORS: Regulator sequences that operate to stimulate or repress transcription from a given promoter in certain cell types or under certain conditions can often be combined with any of a variety of different promoters to create a transcription control element with useful characteristics. The universe of known regulatory sequences operative in different organisms is very large. Particularly preferred elements that are commonly used in vector systems include, for example, the lac operon, the λ cI site, the tet operon, lexA sites, Gal4 sites, the SV40 enhancer, the MMTV enhancer, etc. Those of ordinary skill in the art will immediately recognize the huge range of alternative sequences that could be employed in the practice of the present invention. Experiments to define additional such sequences, operative in the context of any particular experiment, are routine.

TRANSCRIPTION TERMINATOR: Although not required, it is sometimes desirable to include in an expression vector sequences that will terminate transcription of relevant sequences at a selected point. Without such termination signals, it may be possible for RNA polymerases, at least under some circumstances, to transcribe indefinitely around a circular construct. A variety of different transcriptional termination sequences have been identified; the one most commonly used in vector applications is probably the SV40 terminator. Alternatively or additionally, 3′-end formation signals, such as polyadenylation sites, may be employed.

SPLICING SIGNALS: In certain circumstances, it may be desirable to include in inventive expression vectors signals that can direct splicing of transcripts encoded by insert sequences. For example, if a vector includes a promoter and exonic sequences including a splice donor site, then insert sequences containing a splice acceptor site can be expressed and translated. In certain embodiments of the invention, it might be desirable to provide a collection of vectors or vector fragments (three) that contain the splice acceptor site in all three possible frames, so as to ensure in-frame fusions of insert sequences in one version of the vector, regardless of whether information about the insert sequence is available.

TRANSLATION START: Often, if expression of an insert that does not include 5′ sequences (or is not known to include such sequences) is desired, it will be useful to include translation start sequences. The consensus translation start sequence, known as the Kozak sequence, will provide the strongest translation initiation signal, but in most cases a single ATG reasonably positioned with respect to the start of the transcript will suffice.

TRANSLATION STOP: Expression vectors designed to express insert sequences that may be lacking their natural 3′ ends often benefit from the inclusion of translation stop sequences. As with the translation start and splicing sequences, families of vectors can be prepared containing the relevant sequences in all three possible frames so that knowledge of the insert sequence is not required. Alternatively, a single vector could be employed but families of insert fragments can be prepared with additional (or fewer) nucleotides on one or both ends.
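The reading-frame bookkeeping behind such three-member families can be illustrated with a short sketch. It is offered only as an illustration, not as part of the described methods: the function name `padding_for_frame` and its parameters are hypothetical, and it assumes the upstream coding sequence is counted from the first nucleotide of the start codon to the junction.

```python
def padding_for_frame(upstream_coding_len: int, insert_offset: int = 0) -> int:
    """Return how many filler nucleotides (0, 1 or 2) must be added at the
    junction so that the first complete codon of the insert is in frame.

    upstream_coding_len: nucleotides of vector coding sequence before the
        junction, counted from the first base of the start codon (assumed).
    insert_offset: nucleotides at the start of the insert that precede its
        first complete codon (0 when the insert begins on a codon boundary).
    """
    # Python's % always returns a non-negative result for a positive modulus.
    return (-(upstream_coding_len + insert_offset)) % 3


# A family of three vectors differs only in 0, 1 or 2 extra junction
# nucleotides; exactly one member suits any given insert.
family = [0, 1, 2]
needed = padding_for_frame(upstream_coding_len=10, insert_offset=0)
matching_members = [extra for extra in family if extra == needed]
```

Because the required padding is always 0, 1 or 2, a family of three vectors (or three versions of an insert fragment) covers every case even when the insert sequence is unknown.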

Gene Fusions

As those of ordinary skill in the art will be aware, a variety of vector systems have been engineered to generate gene fusions between insert sequences and a reporter gene in the vector backbone. Such fusions are useful, for example, to detect expression patterns of the insert sequences, or to detect expression control elements that may be present in the insert sequences. Gene fusions may also allow a researcher to track the expression products of the fused gene.

Particularly preferred detectable genes for use in gene fusion applications include, for example, LacZ, chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), luciferase, horseradish peroxidase (HRP), etc.

Fusion Proteins

One type of gene fusion that is particularly commonly employed in vector systems is a fusion that generates a fusion protein with a desirable characteristic. As will be appreciated, it will often be desirable to provide families of vectors or vector fragments that allow C-terminal, N-terminal, or internal fusions, and also that allow fusions in all possible frames, preferably without knowledge of insert sequence.

For example, a variety of sequence elements are available that encode polypeptides that, when fused to a polypeptide encoded by an insert sequence, allow that polypeptide to be readily purified. Particularly preferred purification tags include, for example, (His)₆, thioredoxin, glutathione-S-transferase, streptavidin, staphylococcal protein A (which interacts strongly with IgG; Amersham Pharmacia Biotech, Piscataway, N.J.), etc.

Also available are a variety of sequence elements encoding detectable moieties, such as epitopes for which high-specificity antibodies are available, that can be useful in the detection of an expressed fusion protein. Examples of such detectable epitopes include, for example, Xpress™, c-myc, CA25, thioredoxin, V5, HA, calmodulin binding peptide (CBP), Aag, etc.

In some cases, it is desirable to remove the protein tags created by fusion of encoding insert sequences with encoding vector sequences. Sequence elements encoding polypeptide cleavage sites (e.g., sites cleaved by furin, enterokinase, thrombin, factor Xa, PreScission, etc.) are particularly useful in such applications.

Other useful sequence elements for the production of fusion proteins are ones that encode targeting moieties, such as secretion signals (e.g., BiP for insect cells, human placental alkaline phosphatase or human growth hormone for mammalian cells, protein A for bacterial cells, etc.) or other elements, that direct the fusion product to a particular cellular location. Examples of such targeting sequences include, for instance, yeast AgA2 sequences that target the fusion protein to the cell surface, VP22 fusions that target to the mammalian nucleus, pRLT3-NLS, COXVIII signal, etc.

Polylinkers

One virtually ubiquitous element in most commercially available vectors today is a so-called “polylinker” or “multiple cloning site”. In certain embodiments of the invention, it may be desirable to include vector fragments containing such elements in linkage reactions. However, in many embodiments, it will be desirable to create fragments and/or hybrid molecules without the use of restriction enzymes. As techniques for such restriction-free nucleic acid manipulation become more accepted, the need for polylinkers in inventive vectors and reactions will diminish.

Other Elements

Those of ordinary skill in the art will readily appreciate that any of a variety of other sequence elements may be included in vector fragments according to the present invention. The foregoing has been intended to provide merely a sampling of certain examples of sequence elements that are currently commonly found in vector sequences. One of the advantages of the present invention is that, by providing techniques and reagents that allow the ready production of specifically designed vectors through the assembly of prepared fragments, the invention is expected to help researchers expand the range of sequence elements utilized in vector applications.

Insert Elements

As will be apparent to those of ordinary skill in the art, any nucleic acid sequence may be employed as an insert element according to the present invention. A researcher may choose any sequence or sequences s/he likes to be linked to vector sequences. Also, more than one insert element may be employed. Furthermore, each insert element may be provided as a single insert fragment, or may be distributed over multiple insert fragments that will be linked together in series in the final hybrid product. In certain embodiments of the invention, part or all of a given insert element may even be prepared as a single fragment that also includes part or all of one or more vector elements. Any collection of contiguous insert sequences is considered a single insert element for the purpose of the present invention.

Those of ordinary skill in the art will recognize that the classification of particular sequences as “insert elements” as compared with “vector elements” is not critical to the invention. In fact, the inventive recognition that a “vector” need not be a single discrete molecular entity in a sense renders such distinctions arbitrary. Nonetheless, both the concept and the terminology of a “vector backbone” and an “insert” are well established in molecular biology and therefore can be useful for the purposes of clarity and communication.

Preparation and Linkage of Fragments

In general, any method may be used to prepare fragments for hybridization and/or linkage according to the present invention. However, it is preferred that, for each hybrid molecule to be assembled, at least one fragment is prepared without the use of restriction enzymes, and preferably without the use of any endonuclease.

In certain preferred embodiments of the invention, fragments are prepared in a form that allows them to be linked together by ligation. In other embodiments, fragments are prepared in a manner that allows them to be linked together by splicing. In particular, U.S. patent applications U.S. Ser. No. 08/814,412, U.S. Ser. No. 09/399,593, U.S. Ser. No. 09/225,990, and PCT/US00/00189, and U.S. Pat. Nos. 5,498,531 and 5,780,272, each of which is incorporated herein by reference, contain thorough descriptions of methods and strategies useful in the preparation of nucleic acid (RNA or DNA) fragments that contain flanking intronic sequences and can be linked to one another by trans- or cis-splicing. In yet other embodiments, fragments are prepared in a form that allows topoisomerase-mediated linkage.

Often, it will be desirable to prepare fragments so that, for each linkage reaction to be performed in the assembly of a hybrid molecule, the fragments are designed to associate with one another in only one way and to produce only a single major linkage product. For example, fragments may be prepared so that each has single-stranded overhangs on one or both ends, and only fragments that are to be adjacent to one another in a hybrid molecule have complementary overhangs. Alternatively or additionally, fragments may be engineered to include intronic elements that are only compatible with the intronic elements on adjacent fragments. Such “directed linkage” (i.e., linkage in only one arrangement) of fragments is particularly desirable where multiple fragments (i.e., three or more, preferably four or more, and more preferably five or more) are to be linked together in a single linkage reaction. For linkage reactions containing small numbers of fragments (2 or 3), directed linkage can be assured by controlling the phosphorylation state of the relevant fragment ends.
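The overhang-based version of directed linkage can be checked computationally before any fragments are made. The following is a minimal sketch under stated assumptions: overhangs are DNA, each is written 5′ to 3′ on its own strand, two overhangs anneal only when one is the exact reverse complement of the other, and the fragment names and four-nucleotide overhang sequences are hypothetical.

```python
# Watson-Crick complements for DNA overhangs written 5'->3'.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def linkage_is_directed(fragments, circular=True):
    """fragments: list of (name, left_overhang, right_overhang) given in the
    intended order of assembly. Returns True only if the set of overhang
    pairs that can anneal is exactly the set of intended junctions."""
    n = len(fragments)
    ends = {}
    for i, (_name, left, right) in enumerate(fragments):
        ends[(i, "L")] = left.upper()
        ends[(i, "R")] = right.upper()

    # Junctions required by the intended arrangement.
    last = n if circular else n - 1
    intended = {frozenset({(i, "R"), ((i + 1) % n, "L")}) for i in range(last)}

    # Every pair of ends (including an end with itself) that could anneal.
    keys = list(ends)
    observed = set()
    for a in range(len(keys)):
        for b in range(a, len(keys)):
            seq_a, seq_b = ends[keys[a]], ends[keys[b]]
            if seq_a and seq_a == revcomp(seq_b):  # empty string = blunt end
                observed.add(frozenset({keys[a], keys[b]}))
    return observed == intended


# Hypothetical three-fragment circular design with unique overhangs.
good = [("ori", "CAAG", "ACGA"), ("amp", "TCGT", "GGTA"), ("insert", "TACC", "CTTG")]
# Same fragments with one palindromic overhang on every end.
bad = [("ori", "GATC", "GATC"), ("amp", "GATC", "GATC"), ("insert", "GATC", "GATC")]

good_is_directed = linkage_is_directed(good)
bad_is_directed = linkage_is_directed(bad)
```

The second design fails because a self-complementary overhang can anneal to every other end, including a copy of itself, so more than one arrangement is possible.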

In other preferred embodiments of the invention, it may be desirable to prepare fragments so that they can become linked to one another in any of a variety of different ways. This phenomenon is referred to herein as “linkage degeneracy”. In such embodiments, a single linkage reaction can generate a “library” of different hybrid molecules that can subsequently be distinguished and/or separated from one another as desired.

In yet other preferred embodiments of the invention, fragments can be designed for directed ligation as described above, but then multiple alternative versions of each particular fragment can be provided in the same linkage reaction so that, once again, a library of hybrid molecules is produced in a single linkage reaction. This phenomenon is referred to herein as “selection degeneracy”. For example, fragments A, B, and C can be designed and prepared so that they can only be linked to one another in the arrangement ABC (which can be a linear or a circular arrangement). If multiple different A fragments (e.g., A1, A2, A3, . . . An), multiple different B fragments, and/or multiple different C fragments are employed in a single linkage reaction, then a library of different hybrid molecules, each having an ABC structure, will be produced in that reaction (e.g., A1B17C3, A1B1C1, A1B2C1, etc.). Those of ordinary skill in the art will readily appreciate that the different versions of the A fragment need not bear any relationship to one another other than being designed to link only to a B fragment, etc. Alternatively, each version of a given fragment could, for example, contain different varieties of the same vector element(s) or element portion(s) (e.g., different drug resistance genes).
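The size and membership of a library produced by selection degeneracy follow directly from the number of versions supplied at each position. As a minimal sketch (the fragment labels and the helper name `enumerate_library` are hypothetical, and every version at a position is assumed to link equally well), the possible ABC hybrids can be enumerated as a Cartesian product:

```python
from itertools import product


def enumerate_library(*position_variants):
    """Each argument lists the alternative fragment versions available at one
    position of the fixed arrangement. Returns every possible hybrid as a
    concatenated label, e.g. 'A1B2C1'."""
    return ["".join(combo) for combo in product(*position_variants)]


a_versions = ["A1", "A2", "A3"]
b_versions = ["B1", "B2"]
c_versions = ["C1", "C2", "C3", "C4"]

library = enumerate_library(a_versions, b_versions, c_versions)
library_size = len(library)  # 3 * 2 * 4 possible ABC hybrids
```

The library size is the product of the number of versions at each position, so even modest fragment collections can yield large libraries from a single linkage reaction.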

Still other preferred embodiments combine the two kinds of degeneracy discussed above, so that a single linkage reaction may create a library of hybrid molecules in which both the arrangement and selection of fragments are varied.

Particularly preferred fragments for use in accordance with the present invention contain one or more single-stranded overhangs available for hybridization with complementary overhangs on other fragments. It is most preferred that such overhang-containing fragments be prepared without the use of restriction enzymes. It is particularly preferred that such fragments be prepared using RNA-Overhang Cloning (ROC) or DNA-Overhang Cloning (DOC), as described for example in U.S. Ser. No. 09/225,990; PCT US00/00189; and U.S. Ser. No. 09/478,263, each of which is incorporated herein by reference in its entirety (see also Examples 5-8).

Once a hybrid molecule is created by hybridization or linkage of vector and/or insert fragments, it may be replicated by any available in vitro or in vivo mechanism. In certain preferred embodiments of the invention, hybridization or linkage reactions themselves, or isolated or purified hybrids prepared from such reactions (e.g., by gel electrophoresis), may be directly transformed or transfected into host cells (or otherwise introduced into a host system). In some cases, it may be desirable to perform one or more manipulations prior to introducing a hybrid molecule into a host cell. For example, linkage of two fragments created using some embodiments of the ROC methodology will produce a hybrid molecule that includes some regions of double-stranded RNA that may not be stable inside certain host cells. Accordingly, it may be desirable to perform at least a single round of DNA replication of such a hybrid prior to introducing it into a cell. Other circumstances in which such additional manipulations (e.g., nick repair, etc.) are desirable will be apparent to one of ordinary skill in the art.

Kits

As discussed herein, one aspect of the present invention comprises the recognition that vectors are comprised of modular elements and can be assembled from individually prepared fragments. One part of this recognition includes the realization that vector fragments can be provided as isolated cassettes, ready to be assembled by a user or a manufacturer.

In one embodiment of the invention, a variety of different possible vector elements is offered to a user who selects particular pieces of interest. Fragments that together comprise these pieces are then prepared and are provided to the user for assembly into a vector. Optionally, reagents for performing the assembly (e.g., ligase if the fragments are prepared with overhangs amenable to linkage by ligation; splicing reagents if the fragments are prepared for linkage by splicing; etc.) are also provided. Alternatively, the fragments may be linked together into a “designer vector” before being provided to the user.

In other embodiments of the invention, kits are provided that contain multiple optional fragments, each of which contains a selected vector element or elements, or fragment(s) thereof, so that a user can readily assemble any of a variety of different vectors by mixing different collections of fragments together in linkage reactions. For example, a bacterial expression vector kit could be provided that contains (a) a first collection of first fragments, each of which contains the pTac promoter and also contains a portion of an antibiotic resistance gene, where different fragments in the collection contain portions of different antibiotic resistance genes; (b) a second collection of second fragments, each of which contains the remainder of one of the antibiotic resistance genes and also contains the ColE1 origin of replication; and (c) ligation reagents. A user could then select particular first and second fragments that, when ligated with his or her chosen insert fragment(s), would create a hybrid containing a chosen antibiotic resistance gene and the insert element under control of the pTac promoter. Those of ordinary skill in the art will immediately recognize the wide variety of other related kits that could alternatively or additionally be provided.

The inventive recognition of vector modularity also provides a new perspective on valuable reagents, and systems for providing such reagents to users. For example, in addition to kits as discussed above, reagent providers could prepare catalogs or menus (either paper or electronic) from which users can select particular desired vector elements or fragments. In certain preferred embodiments of the invention, the catalog or menu is presented on a World Wide Web site that the user can access and through which the user can place an order. In other embodiments a paper form is provided, or information about telephone contact is provided. As discussed above, selected fragments may be provided to the user as isolated fragments, as fragment collections, as linked pieces (e.g., complete vectors), as kits (e.g., including linkage reagents, purification reagents, amplification reagents, instructions for use and/or other relevant materials), or in any other desirable form. The invention therefore provides, in addition to the various techniques and reagents discussed herein, new methods of doing business in the area of molecular biology reagents.

Hybrid Molecules

As discussed herein, the techniques and reagents provided by the present invention allow the ready assembly of any of a variety of hybrid molecules, generated by hybridization and/or linkage of vector and/or insert fragments. In some embodiments of the invention, a vector is assembled from vector fragments (via one or more linkage reactions) prior to linkage of vector sequences with insert sequences. In other embodiments, assembly of the complete, final hybrid product is accomplished in a single linkage reaction. In other embodiments, one or more linkage fragments is/are linked to one or more vector fragments in a first linkage reaction, and one or more additional linkage reactions are subsequently performed to add additional vector and/or insert fragments. Each and every hybrid molecule produced in such a linkage reaction is encompassed within the scope of the present invention.

EXAMPLES

Example 1 Assembly of a λ Vector/Insert Hybrid

FIG. 1 presents an inventive reaction for the assembly of a hybrid molecule containing two λ phage arms (a λ cloning vector) separated by a chosen insert. As is well known, λ vectors are particularly useful for the cloning of relatively large (up to about 50 kb) fragments. The insert-containing hybrids can be packaged (typically through the use of helper phages) into phage heads in vitro. Although the efficiency of packaging can be relatively low (around 10%), the subsequent efficiency of genome transfer into bacteria through infection is close to 100% (see, for example, Ausubel et al., Current Protocols in Molecular Biology, Unit 1.10, Current Protocols, 1987, the entire contents of which are incorporated herein by reference).

Example 2 Assembly of a Bacterial Vector/Insert Hybrid

FIG. 2 presents an inventive reaction for the assembly of a hybrid molecule containing a bacterial origin of replication, an antibiotic resistance gene, and a chosen insert. The hybrid molecule is assembled by linkage of three fragments, each of which contains a single element. Preferably, the fragments are prepared to have complementary overhangs selected to provide for directional ligation. Alternatively, the indicated element in each fragment may be flanked by intronic components that direct appropriate trans-splicing reactions in vivo or in vitro.

Example 3 Assembly of a Bacterial Vector/Insert Hybrid in Which the Insert Disrupts a Detectable Element

The inventive reaction depicted in FIG. 3 differs from that shown in FIG. 2 (and discussed above in Example 2) in at least two ways. First, the two vector fragments employed in the reaction of FIG. 3 each contain a part of the bacterial origin of replication, so that only hybrid molecules in which these two fragments are properly linked together will be able to replicate in bacteria. Also, each vector fragment contains a portion of the ampicillin resistance gene. If a hybrid is assembled that does not include an insert, the ampicillin resistance gene will be re-created (unless some mutation occurs) and bacteria containing the resulting hybrid will be resistant to both tetracycline and ampicillin. By contrast, the ampicillin resistance gene will not be re-created in hybrids that do contain the insert. Thus, bacteria containing complete hybrids will be distinguishable from those containing partial hybrids that lack insert because those containing complete hybrids will be resistant to tetracycline but not ampicillin, whereas those containing partial hybrids will be resistant to both.

The strategy depicted in FIG. 3 is particularly useful for fragments that do not have directionally specific ends. For example, if blunt-ended fragments are to be employed, or if both ends of the insert fragment (and both ampicillin fragment ends) contain identical overhangs, the ability to identify desirable hybrids from the universe of possible hybrids is particularly useful.
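The selection logic of this Example can be summarized as a small decision table. The sketch below is only an illustration of the reasoning in the text: the function names are hypothetical, and it assumes, as the text implies but does not state explicitly, that an intact tetracycline resistance gene is carried on the vector fragments.

```python
def expected_phenotype(vector_fragments_linked: bool, insert_present: bool):
    """Predicted drug resistance of bacteria carrying a given hybrid.

    Returns None when the two vector fragments are not properly linked,
    because the origin of replication is then not reconstituted and no
    colonies are expected.
    """
    if not vector_fragments_linked:
        return None
    return {
        "tetracycline": True,              # assumed intact on the vector fragments
        "ampicillin": not insert_present,  # gene re-created only without an insert
    }


def classify_colony(grows_on_tet: bool, grows_on_amp: bool) -> str:
    """Interpret replica-plating results for one colony."""
    if grows_on_tet and not grows_on_amp:
        return "complete hybrid (insert present)"
    if grows_on_tet and grows_on_amp:
        return "partial hybrid (no insert)"
    return "not expected from this scheme"
```

In practice a user would plate transformants on tetracycline and then test colonies for sensitivity to ampicillin; the sensitive colonies are the candidates for complete hybrids.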

The strategy depicted in FIG. 4 is analogous to that depicted in FIG. 3 except that hybrids containing insert are distinguishable from those lacking insert on the basis of a blue/white screen rather than a growth/no growth screen.

The strategy depicted in FIG. 5 is also similar, except that linkage of the vector fragments is not required to create a functional origin of replication. For this strategy, it is generally preferred that at least the vector fragments be engineered for directional linkage, so that they can only be linked to one another in a single orientation.

Example 4 Assembly of a Hybrid Bacterial Expression Vector/Insert Construct by 4-Way Ligation

The inventive strategy depicted in FIG. 6 shows simultaneous linkage of three different vector fragments with an insert fragment. A hybrid vector molecule containing both an origin of replication and an ampicillin resistance gene can only be assembled through proper linkage of the three vector fragments. Thus, selection strategies can be employed to identify desirable hybrid molecules. Such molecules can then be screened for expression of the insert in order to identify those that are complete as compared with those that contain only vector sequences.

Example 5 Assembly of a Hybrid Bacterial Vector/Insert Molecule Using DOC

The inventive scheme depicted in FIG. 9 was carried out as follows. Vector fragments were amplified from the pET 19b vector (Novagen, Madison, Wis.) using the following primers (lower case letters indicate RNA residues; upper case letters indicate DNA residues): EV-1 (5′-cauGGTATATCTCCTTCTTAAAG; SEQ ID NO: 1), EV-2 (5′-cucATGACCAAAATCCCTTAAC; SEQ ID NO:2), EV-3 (5′-gagATTATCAAAAAGGATCTTC; SEQ ID NO:3), and EV-4 (5′-uaaCTAGCATAACCCCTTGG; SEQ ID NO:4). EV-1 and EV-2 were used together to generate vector fragment 1, containing the bacterial origin of replication, the LacI gene, and the pT7 promoter; EV-3 and EV-4 were used together to generate vector fragment 2, containing the Amp gene.

In a separate DOC reaction, an insert fragment containing the Lac Z gene was amplified from the pBluescript II SK (−) vector (Stratagene, La Jolla, Calif.), with primers 5′ Lac Z (5′-augACCATGATTACGCCAACG; SEQ ID NO:5) and 3′ Lac Z (5′-uuaCAATTTCCATTCGCCATTC; SEQ ID NO:6). 100 μl PCR reactions contained 5 ng of template DNA, 1× cloned Pfu buffer (Stratagene, La Jolla, Calif.), 1 mM MgSO₄, 200 μM of each dNTP, 1.45 U cloned Pfu (Stratagene), 1.25 U Pfu Turbo™ polymerase and 50 pmol of each primer. Reactions were performed in a Robocycler (Stratagene, La Jolla, Calif.) as follows: 1 cycle 95° C., 5 min; 53° C., 3 min; 72° C., 6 min (10 min for vector fragment 1); 30 cycles, 95° C., 1 min; 53° C., 1 min; 72° C., 3 min (8 min for vector fragment 1); and 1 cycle 72° C., 10 min.
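The lower-case convention used for the primers of this Example makes it straightforward to check which fragment ends are designed to pair. The following sketch copies the six primer sequences from the text and tests whether the 5′ ribonucleotide stretches are reverse complements of one another. The helper names are hypothetical, and treating reverse-complementary 5′ stretches as the intended junctions is a reading of the design, not a statement from the text.

```python
# Primer sequences as listed in Example 5 (lower case = RNA, upper case = DNA).
PRIMERS = {
    "EV-1": "cauGGTATATCTCCTTCTTAAAG",
    "EV-2": "cucATGACCAAAATCCCTTAAC",
    "EV-3": "gagATTATCAAAAAGGATCTTC",
    "EV-4": "uaaCTAGCATAACCCCTTGG",
    "5' Lac Z": "augACCATGATTACGCCAACG",
    "3' Lac Z": "uuaCAATTTCCATTCGCCATTC",
}

_RNA_COMPLEMENT = str.maketrans("acgu", "ugca")


def rna_prefix(primer: str) -> str:
    """Leading run of lower-case (RNA) residues at the 5' end of a primer."""
    n = 0
    while n < len(primer) and primer[n].islower():
        n += 1
    return primer[:n]


def rna_revcomp(seq: str) -> str:
    """Reverse complement of a lower-case RNA sequence."""
    return seq.translate(_RNA_COMPLEMENT)[::-1]


def pairing_partners(primers):
    """Return pairs of primer names whose 5' RNA stretches are reverse
    complements of each other."""
    prefixes = {name: rna_prefix(seq) for name, seq in primers.items()}
    names = list(prefixes)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if prefixes[first] == rna_revcomp(prefixes[second]):
                pairs.append((first, second))
    return pairs


junctions = pairing_partners(PRIMERS)
```

Each primer pairs with exactly one other primer, which is consistent with a single circular arrangement of vector fragment 1, vector fragment 2, and the insert.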

Products of the PCR reactions were separated on a 1% agarose gel, and purified using the GENECLEAN II kit (Vista, Calif.). 12 μl of each purified fragment was placed separately in 1× first strand buffer (Life Technologies, Rockville, Md.) with 10 mM DTT, 5 mM of each dNTP, and 200 U M-MLV (Life Technologies). Reactions were incubated for 20 min at 42° C. Reactions were then placed at 70° C. for 10 min to heat-inactivate the enzyme.

Primer ribonucleotides were removed from the PCR products by hydrolysis with NaOH. 6 μl of 1 N NaOH were added to each reaction, and the mixtures were incubated for 30 min at 45° C. 6 μl of 1 N HCl, 4 μl of 10× ligase buffer (USB, Cleveland, Ohio), and 10 U of T4 PNK (USB) were then added. Reactions were incubated at 37° C. for 30 min. Phosphorylated fragments were combined in equimolar amounts (approximately 50 ng) and ligated with 10 U of T4 DNA ligase (USB) at 25° C. for 2 hrs. 5 μl of the ligation reaction was then transformed into E. coli.
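Combining fragments "in equimolar amounts" requires scaling the mass of each fragment to its length. A minimal sketch of that arithmetic follows. The fragment lengths are hypothetical placeholders (the text does not state them), and the conversion to picomoles assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, which is a common approximation rather than a value taken from the text.

```python
AVG_G_PER_MOL_PER_BP = 650.0  # assumed average for double-stranded DNA


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Approximate picomoles of a double-stranded fragment."""
    return mass_ng * 1000.0 / (length_bp * AVG_G_PER_MOL_PER_BP)


def equimolar_masses(lengths_bp: dict, reference: str, reference_ng: float) -> dict:
    """Mass (ng) of every fragment that matches the molar amount of the
    reference fragment. Only length ratios matter, so the result does not
    depend on the assumed mass per base pair."""
    ref_len = lengths_bp[reference]
    return {name: reference_ng * length / ref_len for name, length in lengths_bp.items()}


# Hypothetical fragment lengths, for illustration only.
lengths = {"vector fragment 1": 4000, "vector fragment 2": 1200, "insert": 600}
masses = equimolar_masses(lengths, reference="vector fragment 2", reference_ng=50.0)
```

With these placeholder lengths, a fragment half as long as the reference needs half the mass, and one several times longer needs proportionally more.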

Example 6 Assembly of Hybrid Vector/Insert Molecules Using ROC with Internal Terminators

We prepared hybrid vectors containing an origin of replication (Ori) fragment and a kanamycin resistance gene (KAN) fragment, by amplifying each fragment with primers that contained one or more residues not copied by the DNA polymerase utilized in the reaction (i.e., “terminator” residues). The Ori fragment was amplified from pET19b (Novagen, Madison, Wis.); the KAN fragment was amplified from pCR 2.1 (Invitrogen, Carlsbad, Calif.). FIG. 11 shows the various primers used and fragments generated. As will be seen, some reactions generated a 2400 bp ori fragment; others generated an 824 bp fragment (denoted “Ori(s)” because it is smaller). The smaller fragment, Ori(s), lacks an 11 bp direct repeat that can create a deletion hotspot when it is present.

PCR reaction cycling, product annealing and E. coli transformation were performed as described in Examples 7 and 8.

Example 7 Assembly of a Hybrid Vector/Insert Molecule Using ROC with Single Nucleotide Terminators

The vector/insert hybrid molecule depicted in FIG. 10 was generated as follows. The ori-containing vector fragment was amplified from pET 19b (Novagen, Madison, Wis.) using primers (lower case letters indicate RNA residues; upper case letters indicate DNA residues) 5′OST (5′-CTGCTAAGTGAGcucGACAGATCGCTGAGATAGGTGC; SEQ ID NO:5) and 1N3′Ori(s) (5′-AAGCTTGCTAAGTAgGGCGTTTTTCCATAGGCTCCG; SEQ ID NO:6).

The vector fragment containing the kanamycin resistance gene was amplified from pCR2.1 Topo (Invitrogen, Carlsbad, Calif.) using primers 1NT5′KAN (5′-CTACCTAGCAAGCTuCTATCTGGACAAGGGAAAACG; SEQ ID NO:7) and T7 3′KAN (5′-CCCTATAGTGAGTCGTATTAaGGCGAAAACTCTCAAGGATC; SEQ ID NO:8).

The insert fragment containing the luciferase gene was amplified from pGI II basic (Promega, Wis.) using primers tCS1 (5′-TTAATACGACTCACTATAGGGATGGAAGACGCCAAAAACATA; SEQ ID NO: 9) and tCS8 (5′-GAGCTCACTTAGCAGTTACAATTTGGACTTTCCGCC; SEQ ID NO: 10).

Each 100 μl reaction contained 50 pmol of each primer, 1× cloned Pfu Buffer (10 mM (NH₄)₂SO₄, 20 mM Tris (pH 8.8), 2 mM MgSO₄, 10 mM KCl, 0.1% Triton X-100 and 0.1 mg/ml bovine serum albumin), 1 mM additional MgSO₄, 0.3 mM each dNTP, 5-10 ng template DNA and 1.25-1.85 units of both cloned Pfu and Pfu Turbo polymerases (Stratagene, La Jolla, Calif.). The Ori fragment was amplified in a reaction involving (1) one cycle of 95° for 3 min; 46-60° for 2 min; (2) 35 cycles of 95° for 30 sec; 48-60° for 30 sec; and 72° for 3 min; and (3) one cycle of 95° for 30 sec; 48-60° for 30 sec; and 72° for 8 min. The KAN and LUC fragments were amplified in similar reactions except that the 35 cycles contained a 4.5 min 72° step.

PCR products generated in these reactions were gel purified using the Qiaquick gel extraction kit (Qiagen, Valencia, Calif.). Approximately 80 ng of each fragment was combined in a 20 μl reaction volume. Two (2) μl 10× USB ligation buffer (660 mM Tris-HCl (pH 7.6), 66 mM MgCl₂, 100 mM DTT, 660 μM ATP) (USB, Cleveland, Ohio) was then added, to make a 1× reaction mix. The reaction was heated to 65° C. for 8 minutes, and then slow cooled for 20 minutes (to 35-40° C.) to allow the fragments to anneal. Samples were spun and allowed to incubate another 15 minutes at room temperature.

The annealing reaction was precipitated by adding 100 μl of 100% ethanol, followed by a 15 minute incubation at −80° C., and washes with 70% and 100% ethanol. Electrocompetent DH5α cells were transformed using a Biorad E. coli pulser (Biorad, Hercules, Calif.). Five (5) μl of each annealing reaction was combined with 40 μl of Electromax DH5α-E cells (Life Technologies, Rockville, Md.). Individual clones generated in this experiment were isolated, restriction mapped, and sequenced; all junctions were correct.

Those of ordinary skill in the art will appreciate that, as with Example 6, the ROC technique described in this Example utilizes primers containing internal ribonucleotide residues (in one case, 3 residues were used; in other cases only one) flanked by DNA residues. The overhangs created in these ROC PCR reactions, therefore, have only a single “ribo” residue; other overhang residues are DNA. In separate experiments, we have demonstrated that any individual ribonucleotide (i.e., rA, rG, rU, or rC) can act effectively to block extension of a complementary strand by an appropriate DNA polymerase, so that overhangs are created (see, for example, Example 6). We have also shown that single 2′-O-methyl residues are similarly effective (see Example 10). Primers containing 2′-O-methyl residues can often be synthesized more easily (e.g., due to higher coupling efficiencies) than those containing ribonucleotides, and will generally be more stable, so that they are preferred for many applications.

Example 8 Streamlined Cloning

Inventive modular vector fragments may be prepared, annealed together, and transformed into host cells, without enzymatic ligation. For example, we assembled a two-fragment vector, by preparing one fragment (KAN) containing the kanamycin resistance gene, and one fragment (Ori) containing an origin of replication.

Specifically, two 100 μl PCR reactions were performed to amplify each of the two components of the vector. Each reaction contained 50 pmol of each primer, 1× Cloned Pfu Buffer (10 mM (NH₄)₂SO₄, 20 mM Tris (pH 8.8), 2 mM MgSO₄, 10 mM KCl, 0.1% Triton X-100 and 0.1 mg/ml bovine serum albumin), 1 mM additional MgSO₄, 0.3 mM of each dNTP, 5-10 ng of plasmid template and 1.25-1.85 units of both cloned Pfu and Pfu Turbo polymerases (Stratagene, La Jolla, Calif.).

The following chimeric RNA/DNA primers were purchased from Oligo's Etc.(Willsonville, Oreg.): (ribonucleotides are in lower case) 1NT5′KAN-CTACCTAGCAAGCTuCTATCTGGACAAGGGAAAACG (SEQ ID NO: 11) 1NT3′KAN-GAGCTCACTTAGCAAGGCGAAAACTCTCAAGGA (SEQ ID NO:12)1NT5′Ori-TTGCTAAGTGAGCTcGACAGATCGCTGAGATAGGTGC (SEQ ID NO:13)1NT3′Ori(s)-AAGCTTGCTAAGTAgGGCGTTTTTCCATAGGCTCCG (SEQ ID NO:14). Primers1NT 5′KAN and 1NT 3′KAN were used to amplify the Kan fragment from pCR2.1 Topo (Invitrogen, Carlsbad, Calif.). Primers 1NT5′Ori and1NT3′Ori(s) were used to amplify the Ori fragment from pET 19b (Novagen,Madison, Wis.). The following cycles were performed: one cycle of 95°,3′, 48-60°, 2′, 72°, 8′; followed by 35 cycles of 95°, 30 sec, 48-60°,30 sec, 72°, 3′ for Ori fragment, 4.5′ for Kan and Luc fragments. Afinal cycle with an 8′ 72° step was performed in all cases.

Approximately 80 ng of each fragment (5 μl each) produced in the PCR reactions was combined and mixed. 5 μl of this mixture was then transformed into 100 μl of chemically competent DH5α cells without a DNA purification step or an annealing step. Positive clones were isolated and mapped; sequence junctions appear to be correct.

Example 9 Varying the Length of the Overhang

Ligation-dependent experiments were performed in which the length of the overhang was shortened to six, three and one nucleotide(s). These experiments used chimeric primers consisting of a variable number of 5′ DNA nucleotides followed by a single ribonucleotide and a stretch of template-binding DNA nucleotides. Thus, the overhangs included five, two or zero DNA nucleotides, respectively. All three of these primer configurations produced accurately ligated vectors; however, the yield of colonies was approximately 170-fold lower when a single ribonucleotide overhang was used (850,000 colonies per μg vs. 5,000 colonies per μg). DNA sequence analysis revealed one case of a duplicated junction, apparently created by blunt end ligation of PCR products, suggesting that the terminator nucleotide was being read through at some frequency. Blunt ended fragments had not been detected in previous experiments, most likely due to the ligation-independent procedure that was used.
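
The relationship between overhang length and primer layout, and the reported difference in yield, can be restated as a short calculation. The Python sketch below is illustrative; the variable names are hypothetical, and the colony counts are those quoted above.

```python
# Overhang lengths tested in Example 9 (nucleotides, including the ribo residue).
overhang_lengths = [6, 3, 1]

# Each overhang ends in a single ribonucleotide, so the number of 5' DNA
# nucleotides is the overhang length minus one.
dna_nucleotides = [length - 1 for length in overhang_lengths]
print(dna_nucleotides)  # [5, 2, 0]

# Colony yields quoted in the text (colonies per microgram of DNA).
reference_yield = 850_000
single_ribo_yield = 5_000

# Fold reduction in yield with the single-ribonucleotide overhang.
fold_lower = reference_yield / single_ribo_yield
print(fold_lower)  # 170.0
```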

Example 10 Cloning with 2′-O-methyl Terminator Primers

Modules, DNA Templates, and Primers (Table 1). Three primers are shown for each vector module. The 2′-O-methyl residues are underlined. When plasmids were assembled from two components, the first two primers in each set were used in PCR. When plasmids were assembled from three components, the first and third primers in each set were used in PCR.

PCR. Each 100 μl reaction contained 50 pMol of each primer, 1× Pfu Buffer (10 mM (NH₄)₂SO₄, 20 mM Tris (pH 8.8), 2 mM MgSO₄, 10 mM KCl, 0.1% Triton X-100 and 1 mg/ml bovine serum albumin), 1 mM additional MgSO₄, 0.3 mM of each dNTP, 5-10 ng of plasmid template and 1.25-1.85 units each of cloned Pfu and Pfu Turbo polymerases (Stratagene, La Jolla, Calif.). Chimeric RNA/DNA primers were purchased from Oligo's Etc. (Wilsonville, Oreg.). A typical step program for PCR was as follows: one cycle of 95°, 3′; 52°, 2′ followed by 30 cycles of 95°, 30 sec; 52°, 30 sec; 72°, 2′.

Annealing Reaction. When annealing reactions were performed, PCR products were gel purified from a 1% agarose gel, using the Qiaquick gel extraction kit (Qiagen, Valencia, Calif.). Approximately 80 ng of each fragment were combined in a 20 μl reaction that included 2 μl of 10× USB ligation buffer (660 mM Tris-HCl (pH 7.6), 66 mM MgCl₂, 100 mM DTT, 660 μM ATP) (USB, Cleveland, Ohio) to make a 1× annealing mix. The reaction was heated to 65° C. for 8 minutes, then cooled for 20 minutes to approximately 35-40° C. Samples were centrifuged briefly and incubated another 15 minutes at room temperature.

Transformation. Annealing reactions, when performed, were precipitated by adding 100 μl of 100% ethanol, followed by a 15 minute incubation at −80° C., centrifugation, and 70% and 100% ethanol washes. Electrocompetent DH5α cells were transformed using a Biorad E. coli pulser (Biorad, Hercules, Calif.); 5 μl of each annealing reaction were combined with 40 μl of Electromax DH5α-E cells (Life Technologies, Rockville, Md.).

Simplified PCR Cloning. For simplified cloning experiments, we combined 5 μl of the Kan PCR reaction with 5 μl of the Ori PCR reaction and added the sample to 100 μl of chemically competent E. coli cells. The mixture was incubated on ice for 15′ followed by a 45 second heat shock at 42° C. Cells were incubated on ice for another 2′ before 800 μl of LB media was added. Cells were incubated with shaking for 45′ at 37° C. before plating.

Labeling of Primers. One nMol of a DNA primer called 5′ Amp S/PE (5′-TGAGAGTGCACCATATGCG [SEQ ID NO: 45]) was combined with 5 μl of γ-³²P labeled ATP (3000 Ci/mMol), 5 μl of 10× PNK buffer (330 mM Tris-acetate (pH 7.8), 660 mM potassium acetate, 100 mM magnesium acetate, and 5 mM DTT), 2 μl (20 units) T4 polynucleotide kinase (Epicentre, Madison, Wis.), and H₂O to 50 μl. Reactions were incubated for 30′ at 37° C.

Sequence Ladder. A DNA sequencing reaction was performed using the Sequenase 2.0 DNA sequencing kit (USB, Cleveland, Ohio) with the Amp/ColE1 vector as a template. Reactions contained 0.5 μl of α-³²P labeled ATP (800 Ci/mMol), 25 pMol primer and 1 μg DNA.

PCR Reactions. Each 100 μl Pfu PCR reaction contained 50 pMol of each primer, 1× Cloned Pfu Buffer (10 mM (NH₄)₂SO₄, 20 mM Tris (pH 8.8), 2 mM MgSO₄, 10 mM KCl, 0.1% Triton X-100 and 0.1 mg/ml bovine serum albumin), 1 mM additional MgSO₄, 0.3 mM of each dNTP, 5-10 ng of plasmid template and 1.25 units of both cloned Pfu and Pfu Turbo polymerases (Stratagene, La Jolla, Calif.). Pfu exo (−) reactions contained 2.5 units of Pfu exo (−) polymerase (Stratagene, La Jolla, Calif.). Each 100 μl Taq PCR reaction contained 50 pMol of each primer, 1× Taq buffer (10× Reaction Buffer without MgCl₂: 500 mM KCl, 100 mM Tris-HCl (pH 9.0 at 25° C.) and 1.0% Triton® X-100), 1 mM MgCl₂, 0.1 mM of each dNTP, and 5 units Taq polymerase in storage buffer A (Promega, Madison, Wis.).

PCR Cycling. One cycle of 95°, 3′; 52°, 2′; 72°, 15 seconds, followed by 30 cycles of 95°, 30 sec; 55°, 30 sec; 72°, 15 seconds. Chimeric RNA/DNA primers were purchased from Oligo's Etc. (Wilsonville, Oreg.).

Using the techniques set out above, we tested whether primers that harbored a single 2′-O-methyl nucleotide rather than a single ribonucleotide could be used as terminator primers. These chimeric 2′-O-methyl primers demonstrated greatly improved cloning efficiency. We obtained 500,000 colonies per μg of DNA. Thus, cloning with primers containing a single 2′-O-methyl nucleotide is 25-fold more efficient than cloning with primers that harbor single ribonucleotides. We tested each of the four possible 2′-O-methyl nucleotides and found that each of the four nucleotides functions to generate an overhang for terminator cloning.

In order to confirm the presence of the single stranded overhangs, and compare termination efficiencies with different types of chimeric primers and polymerases, we designed an experiment that used radiolabeled primers in terminator PCR (t-PCR) reactions. A ³²P labeled DNA reverse primer and a chimeric forward primer were used in a PCR reaction to generate a 100 base pair (bp) product. These reactions were denatured and resolved on an 8% polyacrylamide gel (FIG. 13B). If the polymerase terminates at the terminator nucleotide(s), an 85 nucleotide labeled product is expected. Full read-through to the end of the chimeric primer would generate a 100 nucleotide labeled product. The sequences of all chimeric primers were identical, and were varied only with respect to the type of nucleotide(s) comprising the terminator positions (i.e., ribo, deoxy, 2′-O-methyl). The same ³²P labeled DNA reverse primer was used for all experiments.

The following primers were tested: a DNA primer, an RNA/DNA primer with ribonucleotides at positions 13, 14, 15, an RNA/DNA primer with a single ribonucleotide at position 15, and a primer with a single 2′-O-methyl nucleotide at position 15. Pfu, Pfu(exo-), and Taq polymerases were tested.
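
The expected band sizes in this assay follow from the product length and the terminator position. The Python sketch below is a minimal illustration; the function name `labeled_strand_length` is hypothetical, and it assumes that termination leaves the terminator residue and everything 5′ of it in the chimeric primer uncopied.

```python
def labeled_strand_length(product_length: int, terminator_position: int,
                          read_through: bool) -> int:
    """Length of the strand extended from the labeled reverse primer.

    product_length: full-length PCR product, in nucleotides.
    terminator_position: 1-based position of the 3'-most terminator
        residue in the chimeric forward primer.
    Assumption: on termination, the terminator and all residues 5' of it
    in the chimeric primer are left uncopied.
    """
    if read_through:
        return product_length
    return product_length - terminator_position


# Values from the text: 100 bp product, terminator at position 15.
print(labeled_strand_length(100, 15, read_through=False))  # 85
print(labeled_strand_length(100, 15, read_through=True))   # 100
```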

The radiolabeled PCR experiments confirmed that there is a stop or pause induced by the terminator residues. PCR reactions using Pfu polymerase demonstrated termination with all versions of the terminator primers as expected. However, there was significant read through with the single ribonucleotide primer (FIG. 13B, lane 1). The experiment with Pfu(exo-) demonstrates that the 3′-5′ exonuclease activity of Pfu is important for termination, since Pfu(exo⁻), which lacks the 3′-5′ exonuclease activity, reads through the terminator with high efficiency (lanes 5-7). PCR reactions using Taq polymerase showed little termination by RNA nucleotides, but did appear to be strongly stopped by the 2′-O-methyl residue.

We expected termination to occur one nucleotide prior to the terminator residue. Although Pfu terminated at that position, termination also occurred two nucleotides or three nucleotides prior to the expected position. Thus, termination generated a population of molecules, with tails of 15, 16 or 17 nucleotides.

An unexpected early termination was observed adjacent to the template region of both the DNA control primer and the single ribonucleotide chimeric primer (FIG. 13B, asterisk). This stop was not seen in the three ribonucleotide or 2′-O-methyl t-PCR reactions. The mechanism of this unexpected termination is unknown.

We developed the t-PCR cloning method described above as a tool for creating useful biological diversity by linking together modular DNA elements in a ligation-independent manner. We tested this approach by using the method to clone genes while simultaneously assembling the vectors into which the genes were cloned.

The modular vector system works in the following way. Vector elements (i.e., drug resistance genes, replication origins, or insert genes) are amplified with Pfu polymerase using 2′-O-methyl terminator primers. The overhangs on the amplified vector elements (modules) follow a specific set of rules that allow similar functional modules to be interchanged in a standardized orientation. Modules can be inserted or substituted into existing vectors or can be combined to construct new vectors.

Combining drug resistance and origin of replication modules with insert modules produces cloned genes in customized vectors (FIG. 14). We constructed a set of vectors using two possible origins of replication, three possible drug resistance genes, and two different marker genes. These fragments were combined in all possible combinations to generate a total of 18 distinct plasmids.
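
The total of 18 plasmids can be reproduced by enumerating module combinations. The Python sketch below is illustrative: the origin and resistance module names are placeholders (only GFP and LacZ are named in the text), and it assumes that the count treats "no insert" as a third option alongside the two marker genes, which is consistent with the six insert-free vectors of FIG. 14A.

```python
from itertools import product

# Placeholder module names; only GFP and LacZ are named in the text.
origins = ["origin_1", "origin_2"]
resistance_genes = ["resistance_1", "resistance_2", "resistance_3"]
# None stands for a two-module vector assembled without an insert
# (an assumption about how the total of 18 is reached).
inserts = [None, "GFP", "LacZ"]

# Every (origin, resistance gene, insert) combination is one plasmid.
plasmids = list(product(origins, resistance_genes, inserts))
no_insert = [p for p in plasmids if p[2] is None]

print(len(plasmids))   # 18
print(len(no_insert))  # 6
```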

Six vectors were produced in two-molecule reactions that contained no insert (FIG. 14A). The GFP- and LacZ-containing vectors were assembled in three-molecule reactions (FIG. 14B), but were also assembled using two-molecule reactions in order to compare cloning efficiencies between two- and three-molecule cloning. We find that two-molecule assembly experiments are more efficient than three-molecule assembly experiments. For example, cloning of GFP or LacZ via a two-molecule experiment is approximately 6-fold more efficient than cloning of the same gene via a three-molecule experiment (1400 colonies per μg vs. 240 colonies per μg). Unexpectedly, however, we find that the efficiency of two-molecule assembly varies significantly depending on the design of the experiment. The two-molecule assembly experiments that combined a resistance gene with an origin of replication (FIG. 14A) were typically 450-fold more efficient than the experiments that combined a marker gene (i.e., LacZ or GFP) with an entire amplified plasmid. The reason for this difference in efficiency remains to be determined.

We have successfully completed thirty modular assembly experiments. Twenty-five of the plasmids were produced on the first attempt. All recombinants were confirmed by restriction mapping. Approximately 80% of the plasmids that were mapped (90/116) exhibited the expected recombinant restriction map. Fifteen ligation junctions were sequenced, and fourteen junctions were free of point mutations. We conclude that t-PCR cloning provides a rapid and accurate method for gene cloning and modular vector assembly.
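
The summary figures above can be expressed as percentages with a short calculation. The Python sketch below is illustrative; the helper name `percent` and the variable names are hypothetical, and the counts are those quoted in the preceding paragraph.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts quoted in the text.
first_attempt_rate = percent(25, 30)    # plasmids produced on the first attempt
correct_map_rate = percent(90, 116)     # mapped plasmids with the expected map
clean_junction_rate = percent(14, 15)   # sequenced junctions free of point mutations

print(first_attempt_rate)   # 83.3
print(correct_map_rate)     # 77.6
print(clean_junction_rate)  # 93.3
```

The 77.6% value corresponds to the "approximately 80%" stated in the text.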

Other Embodiments

Those of ordinary skill in the art will appreciate that the foregoing represents certain preferred embodiments of the invention, but is not intended to limit the spirit or scope of the following claims.

1. A method comprising steps of: performing a polymerase chain reaction in which one primer contains at least one 2′-O-methyl residue, such that extension of the complementary strand terminates at the 2′-O-methyl residue.
2. A nucleic acid molecule containing a double stranded portion and at least one single-stranded overhang, which nucleic acid molecule is produced according to the method of step 1.
3. A method comprising steps of: ligating the nucleic acid molecule of claim 2 to another nucleic acid molecule.
4. The method of claim 3, wherein the ligating occurs in vivo.
5. The method of claim 3, wherein the ligating is performed in vitro.
6. The method of claim 3, wherein the ligating is performed by adding the polymerase chain reaction mixture to competent cells without first isolating the nucleic acid.
7. A nucleic acid molecule produced according to the method of claim 3.